## Implementing Memory Failure Analysis in a Data Processing System

## **ABSTRACT**

5

10

15

A system includes a data processing core coupled to a system memory employing error correction code (ECC) circuitry. The core includes an indicator of when a correctable system memory error occurs and what address is associated with the error. A watchdog timer is instantiated on a system management device. Periodically, the timer prompts the management device to interrupt the processor and poll the error indicator to determine if a memory error has been detected. If an error is detected, the corresponding physical memory address is recorded. If a predetermined number of consecutive errors associated with a single memory address or range of addresses occurs, an alert is issued. In one embodiment, polling the error indicator is infrequent initially. As additional errors are detected, the polling frequency increases. At higher polling frequencies, the system may require a greater number of consecutive errors before taking additional action.